{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 5.4 FETI-DP in NGSolve III: Using Non-Point Constraints\n",
    "\n",
    "We show how to use FETI-DP with point- and edge-constraints.\n",
    "\n",
    "This includes setting up the primal edge-constraints, but not the implementation\n",
    "of the dual-primal space inverse operator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_procs = '40'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from usrmeeting_jupyterstuff import *"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "stop_cluster()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Waiting for connection file: ~/.ipython/profile_ngsolve/security/ipcontroller-kogler-client.json\n",
      "connecting ... try:6 succeeded!"
     ]
    }
   ],
   "source": [
    "start_cluster(num_procs)\n",
    "connect_cluster()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[stdout:31] \n",
      "global,  ndof = 75403 , lodofs = 10355\n",
      "avg DOFs per core:  2260.7\n"
     ]
    }
   ],
   "source": [
    "%%px\n",
    "from ngsolve import *\n",
    "import netgen.meshing as ngmeshing\n",
    "from ngsolve.la import SparseMatrixd, ParallelMatrix, ParallelDofs\n",
    "from ngsolve.la import FETI_Jump, DISTRIBUTED, CUMULATED\n",
    "\n",
    "nref = 0\n",
    "\n",
    "dim=3\n",
    "ngmesh = ngmeshing.Mesh(dim=dim)\n",
    "ngmesh.Load('cube.vol')\n",
    "for l in range(nref):\n",
    "    ngmesh.Refine()\n",
    "mesh = Mesh(ngmesh)\n",
    "comm = MPI_Init()\n",
    "fes = H1(mesh, order=2, dirichlet='right|top|top|left')\n",
    "a = BilinearForm(fes)\n",
    "u,v = fes.TnT()\n",
    "a += SymbolicBFI(grad(u)*grad(v))\n",
    "a.Assemble()\n",
    "f = LinearForm(fes)\n",
    "f += SymbolicLFI(x*y*v)\n",
    "f.Assemble()\n",
    "pardofs = fes.ParallelDofs()\n",
    "avg_dof = comm.Sum(fes.ndof) / comm.size\n",
    "if comm.rank==0:\n",
    "    print('global,  ndof =', fes.ndofglobal, ', lodofs =', fes.lospace.ndofglobal)\n",
    "    print('avg DOFs per core: ', avg_dof)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A new dual-primal space\n",
    "\n",
    "\n",
    "If we additionally introduce edge-constraints to the old dual-primal space $\\widetilde{V}_{\\scriptscriptstyle DP}$,\n",
    "we get out new dual-primal space :\n",
    "$$\n",
    "\\widetilde{V}_{\\scriptscriptstyle DP} = \\{u\\in V_{\\scriptscriptstyle DP}: \\{u\\}_e \\text{ is continuous } \\forall e\\in\\mathcal{E}\\}\n",
    "$$\n",
    "\n",
    "We already have $A_{\\scriptscriptstyle DP}$ for the dual-primal space without edge-constraints.\n",
    "We can simply reuse that.\n",
    "\n",
    "However, the **inverse** operation, $b\\rightarrow A_{\\scriptscriptstyle DP}^{-1}b$ is different.\n",
    "\n",
    "In other words, we have to find a $u\\in \\widetilde{V}_{\\scriptscriptstyle DP}$, such that\n",
    "$$\n",
    "u = \\text{argmin}_{v\\in \\widetilde{V}_{\\scriptscriptstyle DP}}\\left[\\frac{1}{2}\\left<Au,u\\right> - \\left<b,u\\right>\\right]\n",
    "$$\n",
    "\n",
    "In our implementation, instead of using a transformation of basis and explicitely working on\n",
    "$\\widetilde{V}_{\\scriptscriptstyle DP}$, we instead work in $V_{\\scriptscriptstyle DP}$ and\n",
    "solve a saddle point problem.\n",
    "\n",
    "More preciely, in order for $u$ to be in $\\widetilde{V}_{\\scriptscriptstyle DP}$, there has to be some\n",
    "vector $w\\in\\mathbb{R}^m$ (with $m=\\#\\text{subdomain edges}$) with\n",
    "$$\n",
    "\\{u^{(i)}\\}_{e_k} = w_k \\quad\\forall i\\text{ such that } e\\in\\Omega_i\n",
    "$$\n",
    "\n",
    "This just means that for every edge there is a single value that equals the edge-average of $u$ for\n",
    "**all** neighbouring subdomains.\n",
    "\n",
    "Now, with local edge-average matrices $B^{(i)}_p$ and restriction matrices $R^{(i)}$\n",
    "$$\n",
    "u\\in\\widetilde{V}_{\\scriptscriptstyle DP} \\Leftrightarrow u\\in V_{\\scriptscriptstyle DP} \\wedge \n",
    "\\exists w\\in\\mathbb{R}^m: B^{(i)}_p u^{(i)} = R^{(i)}w ~ \\forall i\n",
    "$$\n",
    "\n",
    "We can now incorporate the constraint $ B_p u = w $ by going to a saddle point problem, and now have to solve:\n",
    "\n",
    "$$\n",
    "\\left(\\begin{matrix}\n",
    "A & 0 & B_p^T \\\\\n",
    "0 & 0 & -R^T \\\\\n",
    "B_p & -R & 0\n",
    "\\end{matrix}\\right)\n",
    "\\left(\\begin{matrix}\n",
    "u \\\\\n",
    "w \\\\\n",
    "\\mu\n",
    "\\end{matrix}\\right)\n",
    "=\n",
    "\\left(\\begin{matrix}\n",
    "b \\\\\n",
    "0 \\\\\n",
    "0\n",
    "\\end{matrix}\\right)\n",
    "$$\n",
    "\n",
    "We can eliminate all dual DOFs, as well as the lagrange parameters $\\mu$ by\n",
    "**locally** eliminating the matrix block:\n",
    "\n",
    "$$\n",
    "\\left(\\begin{matrix}\n",
    "A^{(i)}_{\\Delta\\Delta} & {B_p^{(i)}}^{T}\\\\\n",
    "B_p^{(i)} & 0\n",
    "\\end{matrix}\\right)\n",
    "$$\n",
    "\n",
    "(Note: If we use point-constraints, the $A^{(i)}_{\\Delta\\Delta}$ are invertible)\n",
    "\n",
    "We end up with a global problem for $(u_\\pi, w)^T$ of size\n",
    "\n",
    "    #subdomain vertices + #subdomain edges\n",
    "\n",
    "The exact implementation is a bit tedious.\n",
    "\n",
    "What we have to supply to be able to use it is:\n",
    "  - A matrix $B_p^{(i)}$ for each subdomain. We will use simple algebraic averages,\n",
    "    in which case this is just a sparse matrix filled with ones.\n",
    "  - ParallelDofs for the space $w$-lives in.\n",
    "\n",
    "Finding the subdomain vertices works mostly as before:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%px\n",
    "pythonic = True\n",
    "if pythonic:\n",
    "    faces = [set(d for d in pardofs.Proc2Dof(p) if d<mesh.nv and fes.FreeDofs()[d] ) for p in pardofs.ExchangeProcs()]\n",
    "    edges = sorted([tuple(sorted(e)) for e in set(tuple(f1.intersection(f2)) for f1 in faces for f2 in faces if f1 is not f2) if len(e)>1])\n",
    "    vertices = sorted(set([ v for e1 in edges for e2 in edges if e1 is not e2 for v in set(e1).intersection(set(e2)) ]))\n",
    "else:\n",
    "    faces = []\n",
    "    for p in pardofs.ExchangeProcs():\n",
    "        faces.append(set(d for d in pardofs.Proc2Dof(p) if d<mesh.nv and fes.FreeDofs()[d]))\n",
    "    edges = []\n",
    "    for f1 in faces:\n",
    "        for f2 in faces:\n",
    "            if f1 is not f2:\n",
    "                edge = sorted(tuple(f1.intersection(f2)))\n",
    "                if len(edge) > 1:\n",
    "                    if not edge in edges:\n",
    "                        edges.append(sorted(tuple(edge)))\n",
    "    vertices = set()\n",
    "    for e1 in edges:\n",
    "        for e2 in edges:\n",
    "            if e1 is not e2:\n",
    "                vs = set(e1).intersection(set(e2))\n",
    "                vertices = vertices.union(vs)\n",
    "    vertices = sorted(vertices)\n",
    "\n",
    "vec = f.vec.CreateVector()\n",
    "vec.local_vec[:] = 0.0\n",
    "for v in vertices:\n",
    "    vec.local_vec[v] = 1\n",
    "vec.SetParallelStatus(DISTRIBUTED)\n",
    "vec.Cumulate()\n",
    "vertices = [ v for v in range(mesh.nv) if vec.local_vec[v]!=0 ]\n",
    "\n",
    "primal_dofs = BitArray([v in vertices for v in range(fes.ndof)]) & fes.FreeDofs()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We only make one small adjustment:\n",
    "\n",
    "  - we do not want any subdomain vertices do be included in the edges\n",
    "  - (for consistency) we want neither subdomain vertices nor subdomain edge-DOFs in the faces"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%px\n",
    "all_e = set.union(*[set(f) for f in faces]) if len(faces) else {}\n",
    "faces2 = [[v for v in f if not v in all_e] for f in faces]\n",
    "faces = [f for f in faces2 if len(f)]\n",
    "\n",
    "\n",
    "edges2 = [[v for v in e if not v in vertices] for e in edges] \n",
    "edges = [e for e in edges2 if len(e)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We still need the same ParallelDofs as before for $V_{\\scriptscriptstyle DP}$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[stdout:31] \n",
      "# primal dofs global:  122\n",
      "min, avg, max per rank:  4   12.425   25\n"
     ]
    }
   ],
   "source": [
    "%%px\n",
    "primal_dofs = BitArray([ v in set(vertices) for v in range(fes.ndof) ]) & fes.FreeDofs()\n",
    "dp_pardofs = pardofs.SubSet(primal_dofs)\n",
    "\n",
    "nprim = comm.Sum(sum([1 for k in range(fes.ndof) if primal_dofs[k] and comm.rank<pardofs.Dof2Proc(k)[0] ]))\n",
    "npmin = comm.Min(primal_dofs.NumSet() if comm.rank else nprim)\n",
    "npavg = comm.Sum(primal_dofs.NumSet())/comm.size\n",
    "npmax = comm.Max(primal_dofs.NumSet())\n",
    "if comm.rank==0:\n",
    "    print('# primal dofs global: ', nprim)  \n",
    "    print('min, avg, max per rank: ', npmin, ' ', npavg, ' ', npmax)\n",
    "\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting up the Edge-Constraint Matrix\n",
    "Now that we know the edges, it is easy to construct $B_p$ from\n",
    "COO format. Using generator expressions, it can be done in three lines."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%px\n",
    "ar = [(num_e[0],d,1.0) for num_e in enumerate(edges) for d in num_e[1] ]\n",
    "rows, cols, vals = [list(x) for x in zip(*ar)] if len(ar) else [[],[],[]]\n",
    "B_p = SparseMatrixd.CreateFromCOO(rows, cols, vals, len(edges), fes.ndof)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ParallelDofs for w\n",
    "\n",
    "This is also pretty simple. $w$ has one DOF for each subdomain edge, and each edge's\n",
    "DOF is shared by the set of all procs that share **all** DOFs in the entire edge."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[stdout:31] \n",
      "# edge constraints global:  227\n",
      "min, avg, max:  9   17.025   29\n"
     ]
    }
   ],
   "source": [
    "%%px\n",
    "edist_procs = [sorted(set.intersection(*[set(pardofs.Dof2Proc(v)) for v in edge])) for edge in edges]\n",
    "eavg_pardofs = ParallelDofs(edist_procs, comm)\n",
    "\n",
    "neavg = comm.Sum(sum([1 for ps in edist_procs if comm.rank<ps[0]]))\n",
    "neavg_min = comm.Min(len(edges) if comm.rank else neavg)\n",
    "neavg_avg = comm.Sum(len(edges)) / comm.size\n",
    "neavg_max = comm.Max(len(edges))\n",
    "if comm.rank==0:\n",
    "    print('# edge constraints global: ', neavg)\n",
    "    print('min, avg, max: ', neavg_min, ' ', neavg_avg, ' ', neavg_max)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting the Dual-Primal Inverse\n",
    "DPSpace_Inverse needs:\n",
    "\n",
    " - the original matrix matrix\n",
    " - a freedofs-BitArray\n",
    " - a BitArray for the point-constraints\n",
    " - the constraint matrix, and paralleldofs for $w$\n",
    " - the inversetype to use for local problems\n",
    " - the inversetype to use for the global coarse grid problem"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%px\n",
    "from dd_toolbox import DPSpace_Inverse\n",
    "A_dp_inv = DPSpace_Inverse(mat=a.mat, freedofs=fes.FreeDofs(), \\\n",
    "                           c_points=primal_dofs, \\\n",
    "                           c_mat=B_p, c_pardofs=eavg_pardofs, \\\n",
    "                           invtype_loc='sparsecholesky', \\\n",
    "                           invtype_glob='masterinverse')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are not using an explicit tranformation of basis, therefore:\n",
    "# Everything else works the same - experiment below!!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Waiting for connection file: ~/.ipython/profile_ngsolve/security/ipcontroller-kogler-client.json\n",
      "connecting ... try:6 succeeded!"
     ]
    }
   ],
   "source": [
    "from usrmeeting_jupyterstuff import *\n",
    "num_procs = '100'\n",
    "stop_cluster()\n",
    "start_cluster(num_procs)\n",
    "connect_cluster()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%px\n",
    "from ngsolve import *\n",
    "import netgen.meshing as ngmeshing\n",
    "from ngsolve.la import ParallelMatrix, FETI_Jump, SparseMatrixd, ParallelDofs\n",
    "from dd_toolbox import FindFEV, DPSpace_Inverse, ScaledMat\n",
    "\n",
    "def load_mesh(nref=0):\n",
    "    ngmesh = ngmeshing.Mesh(dim=3)\n",
    "    ngmesh.Load('cube.vol')\n",
    "    for l in range(nref):\n",
    "        ngmesh.Refine()\n",
    "    return Mesh(ngmesh)\n",
    "\n",
    "def setup_space(mesh, order=1):\n",
    "    comm = MPI_Init()\n",
    "    fes = H1(mesh, order=order, dirichlet='right|top')\n",
    "    a = BilinearForm(fes)\n",
    "    u,v = fes.TnT()\n",
    "    a += SymbolicBFI(grad(u)*grad(v))\n",
    "    a.Assemble()\n",
    "    f = LinearForm(fes)\n",
    "    f += SymbolicLFI(x*y*v)\n",
    "    f.Assemble()\n",
    "    avg_dof = comm.Sum(fes.ndof) / comm.size\n",
    "    if comm.rank==0:\n",
    "        print('global,  ndof =', fes.ndofglobal, ', lodofs =', fes.lospace.ndofglobal)\n",
    "        print('avg DOFs per core: ', avg_dof)\n",
    "    return [fes, a, f]\n",
    "\n",
    "def setup_FETIDP(fes, a):\n",
    "    faces, edges, vertices = FindFEV(mesh.dim, mesh.nv, \\\n",
    "                                     fes.ParallelDofs(), fes.FreeDofs())\n",
    "    primal_dofs = BitArray([ v in set(vertices) for v in range(fes.ndof) ]) & fes.FreeDofs() \n",
    "    dp_pardofs = fes.ParallelDofs().SubSet(primal_dofs)\n",
    "    ar = [(num_e[0],d,1.0) for num_e in enumerate(edges) for d in num_e[1] ]\n",
    "    rows, cols, vals = [list(x) for x in zip(*ar)] if len(ar) else [[],[],[]]\n",
    "    B_p = SparseMatrixd.CreateFromCOO(rows, cols, vals, len(edges), fes.ndof)\n",
    "    edist_procs = [sorted(set.intersection(*[set(fes.ParallelDofs().Dof2Proc(v)) for v in edge])) for edge in edges]\n",
    "    eavg_pardofs = ParallelDofs(edist_procs, comm)\n",
    "    nprim = comm.Sum(sum([1 for k in range(fes.ndof) if primal_dofs[k] and comm.rank<fes.ParallelDofs().Dof2Proc(k)[0] ]))\n",
    "    if comm.rank==0:\n",
    "        print('# of global primal dofs: ', nprim)  \n",
    "    A_dp = ParallelMatrix(a.mat.local_mat, dp_pardofs)\n",
    "    dual_pardofs = fes.ParallelDofs().SubSet(BitArray(~primal_dofs & fes.FreeDofs()))\n",
    "    B = FETI_Jump(dual_pardofs, u_pardofs=dp_pardofs)\n",
    "    if comm.rank==0:\n",
    "        print('# of global multipliers = :', B.col_pardofs.ndofglobal)\n",
    "    A_dp_inv = DPSpace_Inverse(mat=a.mat, freedofs=fes.FreeDofs(), \\\n",
    "                               c_points=primal_dofs, \\\n",
    "                               c_mat=B_p, c_pardofs=eavg_pardofs, \\\n",
    "                               invtype_loc='sparsecholesky', \\\n",
    "                               invtype_glob='masterinverse')\n",
    "    F = B @ A_dp_inv @ B.T\n",
    "    innerdofs = BitArray([len(fes.ParallelDofs().Dof2Proc(k))==0 for k in range(fes.ndof)]) & fes.FreeDofs()\n",
    "    A = a.mat.local_mat\n",
    "    Aiinv = A.Inverse(innerdofs, inverse='sparsecholesky')\n",
    "    scaledA = ScaledMat(A, [1.0/(1+len(fes.ParallelDofs().Dof2Proc(k))) for k in range(fes.ndof)])\n",
    "    scaledBT = ScaledMat(B.T, [1.0/(1+len(fes.ParallelDofs().Dof2Proc(k))) for k in range(fes.ndof)])\n",
    "    Fhat = B @ scaledA @ (IdentityMatrix() - Aiinv @ A) @ scaledBT\n",
    "    return [A_dp, A_dp_inv, F, Fhat, B, scaledA, scaledBT]\n",
    "    \n",
    "def prep(B, Ainv, f):\n",
    "    rhs.data = (B @ Ainv) * f.vec\n",
    "    return rhs\n",
    "\n",
    "def solve(mat, pre, rhs, sol):\n",
    "    t = comm.WTime()\n",
    "    solvers.CG(mat=mat, pre=pre, rhs=rhs, sol=sol, \\\n",
    "               maxsteps=100, printrates=comm.rank==0, tol=1e-6)\n",
    "    return comm.WTime() - t\n",
    "    \n",
    "def post(B, Ainv, gfu, lam):\n",
    "    hv = B.CreateRowVector()\n",
    "    hv.data = f.vec - B.T * lam\n",
    "    gfu.vec.data = Ainv * hv\n",
    "    jump = lam.CreateVector()\n",
    "    jump.data = B * gfu.vec\n",
    "    norm_jump = Norm(jump)\n",
    "    if comm.rank==0:\n",
    "        print('\\nnorm jump u: ', norm_jump)   "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[stdout:16] \n",
      "global,  ndof = 576877 , lodofs = 75403\n",
      "avg DOFs per core:  6607.56\n",
      "# of global primal dofs:  399\n",
      "# of global multipliers = : 87867\n",
      "\n",
      "it =  0  err =  0.12130848690672913\n",
      "it =  1  err =  0.02390767825931742\n",
      "it =  2  err =  0.00825023321167997\n",
      "it =  3  err =  0.002729895619289229\n",
      "it =  4  err =  0.0008641121927087105\n",
      "it =  5  err =  0.0003048442822355854\n",
      "it =  6  err =  9.712889856412031e-05\n",
      "it =  7  err =  3.3730326402392145e-05\n",
      "it =  8  err =  1.0925387588517252e-05\n",
      "it =  9  err =  3.32252185091653e-06\n",
      "it =  10  err =  1.067131618337071e-06\n",
      "it =  11  err =  3.318060576723047e-07\n",
      "\n",
      "time solve:  0.5276237779762596\n",
      "dofs per core and second:  10933.491326199415\n",
      "\n",
      "norm jump u:  2.0337283556173004e-06\n"
     ]
    }
   ],
   "source": [
    "%%px\n",
    "comm = MPI_Init()\n",
    "mesh = load_mesh(nref=1)\n",
    "fes, a, f = setup_space(mesh, order=2)\n",
    "A_dp, A_dp_inv, F, Fhat, B, scaledA, scaledBT = setup_FETIDP(fes, a)\n",
    "rhs = B.CreateColVector()\n",
    "lam = B.CreateColVector()\n",
    "prep(B, A_dp_inv, f)\n",
    "if comm.rank==0:\n",
    "    print('')\n",
    "t = solve(F,  Fhat,  rhs, lam)\n",
    "if comm.rank==0:\n",
    "    print('\\ntime solve: ', t)\n",
    "    print('dofs per core and second: ', fes.ndofglobal / (t * comm.size))    \n",
    "gfu = GridFunction(fes)\n",
    "post(B, A_dp_inv, gfu, lam)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SparseCholesky<d,d,d>::MultAdd :   1.095705270767212\n",
      "SparseCholesky<d,d,d>::MultAdd fac1 :   0.6547586917877197\n"
     ]
    }
   ],
   "source": [
    "%%px --target 1\n",
    "for t in sorted(filter(lambda t:t['time']>0.5, Timers()), key=lambda t:t['time'], reverse=True):\n",
    "    print(t['name'], ':  ', t['time'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[stdout:2] \n",
      "timers from rank  13 :\n",
      "SparseCholesky<d,d,d>::MultAdd :   1.214454174041748\n",
      "SparseCholesky<d,d,d>::MultAdd fac1 :   0.7252676486968994\n",
      "SparseCholesky<d,d,d>::MultAdd fac2 :   0.47362494468688965\n",
      "SparseCholesky - total :   0.42598485946655273\n"
     ]
    }
   ],
   "source": [
    "%%px\n",
    "t_chol = filter(lambda t: t['name'] == 'SparseCholesky<d,d,d>::MultAdd', Timers()).__next__()\n",
    "maxt = comm.Max(t_chol['time']) \n",
    "if t_chol['time'] == maxt:\n",
    "    print('timers from rank ', comm.rank, ':')\n",
    "    for t in sorted(filter(lambda t:t['time']>min(0.3*maxt, 0.5), Timers()), key=lambda t:t['time'], reverse=True):\n",
    "        print(t['name'], ':  ', t['time'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "stop_cluster()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
